SEPIA: Surface Span Extension to Syntactic Dependency Precision-based MT Evaluation

Authors

  • Nizar Habash
  • Ahmed Elkholy
Abstract

We present a new Machine Translation (MT) evaluation metric, SEPIA. SEPIA falls within the class of syntactically-aware evaluation metrics, which have received considerable attention recently (Liu and Gildea, 2005; Owczarzak et al., 2007; Giménez and Màrquez, 2007). Specifically, SEPIA uses a dependency representation but extends it to include surface span as a factor in the evaluation score. The dependency surface span is the surface distance between two words that are in a direct relationship in a dependency tree. The basic idea behind SEPIA is that long-distance dependencies should receive a greater weight in MT evaluation metrics than short-distance dependencies, because we suspect that having more long-distance matches indicates a higher degree of grammaticality. In the rest of this document we describe the SEPIA metric and its variants, and the publicly available SEPIA package.
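The idea of weighting dependency matches by surface span can be illustrated with a minimal sketch. This is not the SEPIA formula, which the abstract does not give: the function names `dependencies` and `span_weighted_precision`, the head-index input format, and the choice of weight = span are all assumptions made only for illustration.

```python
from collections import Counter


def dependencies(tokens, heads):
    """Return (head_word, dependent_word, span) triples.

    heads[i] is the 0-based index of token i's head, or -1 for the root.
    The span is the surface distance between the two words.
    """
    triples = []
    for i, h in enumerate(heads):
        if h < 0:
            continue  # the root has no head, so it has no span
        triples.append((tokens[h], tokens[i], abs(i - h)))
    return triples


def span_weighted_precision(cand, ref):
    """Precision over candidate dependencies, each weighted by its span.

    Hypothetical weighting (an assumption, not SEPIA's): weight = span.
    A candidate dependency counts as matched if the reference still has
    an unused dependency with the same (head, dependent) word pair.
    """
    ref_pairs = Counter((h, d) for h, d, _ in ref)
    matched = 0
    total = 0
    for h, d, span in cand:
        total += span
        if ref_pairs[(h, d)] > 0:
            ref_pairs[(h, d)] -= 1
            matched += span
    return matched / total if total else 0.0
```

Under this illustrative weighting, a candidate with one matched dependency of span 4 and one unmatched dependency of span 1 scores 4/5, whereas an unweighted dependency precision would give 1/2. That difference is the effect the abstract describes: long-distance matches count for more.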

Similar resources

Dependency-Based Automatic Evaluation for Machine Translation

We present a novel method for evaluating the output of Machine Translation (MT), based on comparing the dependency structures of the translation and reference rather than their surface string forms. Our method uses a treebank-based, wide-coverage, probabilistic Lexical-Functional Grammar (LFG) parser to produce a set of structural dependencies for each translation-reference sentence pair, and th...

A New Syntactic Metric for Evaluation of Machine Translation

Machine translation (MT) evaluation aims at measuring the quality of a candidate translation by comparing it with a reference translation. This comparison can be performed on multiple levels: lexical, syntactic or semantic. In this paper, we propose a new syntactic metric for MT evaluation based on the comparison of the dependency structures of the reference and the candidate translations. The ...

A Customizable MT Evaluation Metric for Assessing Adequacy Machine Translation Term Project

This project describes a customizable MT evaluation metric that provides system-dependent scores for the purposes of tuning an MT system. The features presented focus on assessing adequacy over fluency. Rather than simply examining features, this project frames the MT evaluation task as a classification question to determine whether a given sentence was produced by a human or a machine. Support Ve...

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...

A Weakly-Supervised Rule-Based Approach for Relation Extraction

Rule-based approaches for information extraction usually achieve good precision values, even if they often need a lot of manual effort to be implemented. In this paper, we present a novel rule-based strategy for semantic relation extraction that takes advantage of partial syntactic parsing in order to simplify the linguistic structures containing instances of semantic relations. We also...

Publication date: 2008